Good afternoon, everyone. My name is Ming Chow, and I will be speaking today about NoSQL
databases. How many people here are using a NoSQL database such as Mongo, Redis, Cassandra,
HBase? Many, many to name. Okay. How is your experience so far with them? Okay. Yeah. So
far, so good. They're fast. They're transactional. They're very easy to use. You don't need SQL
to use them. You know? And if you want to insert data, search for stuff, it's all based
on the computer science principle of key value pairs. Okay? So if you've never seen a Mongo
database or a NoSQL database, typically how you want to find data is you want to find
something. I'm connected to a financial news database on Mongo right now. But if you want
to find something, it's going to be something like the database, the name of the collection,
then the find routine. And typically it would take in JSON. So the key is going to be screen
name. Let's say for the screen name is going to be CBS news. Okay? So what I'm going to
do is I'm going to go to the database. I'm going to go to the database. I'm going to
do here just a very simple example is to show how you find all financial news that's
from CBS news on Twitter. And so what happens is those are all your results. Okay? So really
nice and easy. But that's only just one way, one of many ways to search for stuff in a
NoSQL database such as Mongo. What about security of NoSQL databases? That's another story.
That's all over the place.
Right now we have a mixture of heterogeneous and homogenous security issues. Okay? And
that's what I'm here to talk about. Okay? I'm actually very surprised that the topic
of just NoSQL database has never, ever been covered here at DEF CON. Two years ago I talked
about building, you know, the issues of using HTML5, which is used to build a database.
To build things on the application side. There's actually just a lot too, just the
database side of things. And a lot has changed in two years. But one thing that hasn't changed
is we're all still new to NoSQL databases. You know, we're all new to this. And the
only thing largely a lot of us care about is just making it work. Just making it work.
Just making it work. And, of course, that certainly ‑‑ that has some, you know,
You know how usually that goes, especially if you leave security into the hands of developers.
So a homogenous problem, and a very simple one right off the bat.
If you know the database vendor, you know the IP address, you know the port number,
you've almost won the game.
Okay?
Why?
Why is it just knowing just the IP address, the database vendor, and the port number is
good enough?
That's because of this next thing, which is authentication and encryption.
It's almost nonexistent or extremely weak.
If you use many Node ‑‑ if not all Node SQL databases out there, if you take them
out of the box, you take them out of the box, administrative user authentication turned
off.
Okay?
Turned off.
Even ‑‑
Even if they do support features such as encryption and auditing, not only do you
have to turn them on yourselves, but also the, you know, the scheme is really weak.
Just for example, Mongo still uses MD5 weak salts in CouchDB.
If you ever read the documentation of Mongo or Couch or Redis or Cassandra, there is this
one line which I find very surprising.
If there's one thing in common with each and every one of these systems, we urge you to
use this database system in a trusted environment.
That's from the documentation.
Read the documentation.
It's quite mind boggling.
It's security.
Security is the complete afterthought. Look, how big is, you know, how big is NoSQL databases
out there right now? Well, if you do a search on Shodan, right now, if you do a search on
Shodan, it's 40,000 instances of Mongo that are out there, it has, and there are also
20,000 instances of Redis running. So it's a big deal. It's already there.
So this is a ‑‑ these are homogeneous issues that we've seen that affects all NoSQL
databases. Okay. So there's a lot of chatter on this thing known as, okay, NoSQL, I don't
know ‑‑ not only do I not need to know SQL anymore, but this whole problem that I
think you guys might have heard of called SQL injection goes away. Actually, in my humble
opinion ‑‑
Okay.
The injection problem has gotten worse. Okay? Now, okay, sure, SQL injection is gone, but
now we have three ‑‑ I say three different classes of injection attacks. Okay. One is
called schema. Now, NoSQL databases, how they work is they're based off a very dynamic
data model. Okay? If you insert a record ‑‑
Okay.
Or if you create a database that doesn't exist, automatically create it for you right on the
fly. Okay. Yeah, it goes back to the original point that these NoSQL databases are really,
really easy to use. Okay? Very, very flexible. That's a good thing. Of course, the bad thing
is, you know, you have flexible, dynamic record and data entry. Also, if you can easily overwrite
existing values for keys, very, very simply, last key wins. Okay? So I am going to show
you a few demos. Schema I'm going to do last. You can do query, many unsafe querie very
simply by string concatenation. And now this gem. I love this one. How many people are
good with JavaScript here? Okay. Learn it. Learn it. So this is a very simple electrical
Now, a lot of these NoSQL databases, they take in JavaScript functions as parameters
to search and insert, okay?
And I'm going to give you an example of using the where clause.
Now, here, I am now going to give a quick demo on where this works, okay?
Search by handle.
In this example.
So what I've done in this example is I've created a new search system, okay?
There's a whole bunch of Twitter handles that are used by the Bloomberg terminal, and I
have actually stored 4,000 tweets in all.
But let's say that I know that one of the Twitters on the Bloomberg handle is VentureBeat.
So if I type in VentureBeat, hit search, okay.
This is a collection of all the news that have returned by VentureBeat ‑‑ that have
been tweeted out by VentureBeat for, I don't know, a few days.
Okay?
All right.
Works well.
CBS News.
And so we have 208 items, okay?
Now, how can we beat this system?
One thing is, what we can do is if you want to see more records than you want, okay, and
PHP is available.
It's a very interesting beast, working with Mongo databases.
Let's put in for this query parameter known as search box, we add square brackets, dollar
sign N E. Dollar sign N E in Mongo means not equal to.
You can use dollar sign N E to search for things that are not equal to something.
Now what PHP does, okay, what PHP does.
It ‑‑ N E.
N E, there are so many inputs that are within square brackets.
They are automatically converted to an associative write format.
So how you're going to read this is, okay, so what this now ‑‑ this query will do,
the original stuff I showed you was, give me everything that is CBS news or VentureBeat.
Now we just did ‑‑ we just modified the query, and we just changed it on the fly,
and we've said, okay, give me everything that is not equal to C.
Hit search.
CBS News. Hit enter. Now we have all these records, all these news items that are from
sources on Twitter that are not CBS News. Okay? We've returned back everything. So what's
the culprit here? What's the culprit? So if I can show you the source search by handle.php.
And I'm going to show you the line, that one right there. Collection find erase, you know,
search for screen name equals something. Now, remember what I said. If you use square brackets
for your query, you're going to get the same thing. You're going to get the same thing.
If you use square brackets for your query parameters, that stuff will be in ‑‑ that will be
translated into an associative array. So what this will do will be the associative array will be
screen name and then arrow, the value will be in an array ‑‑ associative array format. Not
equal to is the operator. And, of course, what did I use? I think I used CBS News. Okay? So
now I'm going to show you an example of JavaScript injection. Okay? Search, hey, I'm going to
check my.php. Really, really plain looking box here. Now, what you can't do, I didn't
give any directions on how to use this. Okay? But what we can do is this. We can actually
use JavaScript functions. We're going to type in a few JavaScript functions. Function. Okay.
Now, let's say I want to return all the news items from ‑‑ let's say, I want to return
say NBC news. So return this.screen name. Okay. Equals equals. And of course the string
is going to be NBC news. Okay. Semicolon close the statement. Close the function. And here
we go. Return. Okay. This is what it's going to do. It's going to return all the news items
that are from CBS news. But this is using JavaScript. Let's do one more. Let's do one
more which is pretty nice, which is going to be function. Okay. Let's see if we get
everything. Can we also do other mangalings using JavaScript as well, too? Sure, why
not? How about this one, this? Okay. And then it gets right back. And now I'mreally
Now, we can do a regular expression matching. Okay? What we're going to search for is Apple.
What this is going to do, it's going to search for all the news, hey, all 4,000 plus records,
anything that has the word Apple in them. Okay? Let's do some even more crazy thing.
We can also do this. Function while 1. Print more. Actually, I'm going to put this in
source. Now, what this is going to do, oops, did I close? Nope, I'm missing one more. All
right.
All right.
It's going. It's going. I'm going to stop this. You don't need this anymore. But what
I can show you is this. If I SSH into the box, okay, probably going to get a password
error. I didn't. Okay. CD log. Okay. CD MongoDB. Take a look at what I just did in mangled logs.
Okay. And more MongoDB log. I don't like this. How about this one? How about tail?
That was from ‑‑ you know, this is one by result of using ‑‑ well, what you can
do with ‑‑ well, your ‑‑ if your query is based on ‑‑ if your injection is a
JavaScript function. Now, I only got 20 minutes for this whole talk. I just have not even
mentioned what if you do this instead of PHP if you use something like, of course, Node.js
and express. Okay? Now, let's go back to the schema attacks. How about this one? This
one. I like this. I got to show you this. So right now the server is 19%. But what if,
what if, if I run this script that I created using Ruby, okay, one of the nice byproducts,
one of the nice byproducts of all of this, of schema attack, you know, of this whole
dynamic model, okay, what it's going to do is I'm going to open up a word. It's going
to create a word list of ‑‑ a word list file. Okay? And it's going to create a brand
new database for each and every word in this file. One nice byproduct is you can exhaust
the system resources on the server, take up 100% of the space. Okay? So if you take
a look, now ‑‑ oops, not yet. Okay. We'll let this thing run.
Let this thing run. Okay? All right. Heterogeneous problems. Now, how many NoSQL databases there
are? This many. Okay? Too many to name. Now, the big problem is different database systems,
different NoSQL database systems, you're also dealing with different sets of terminology,
for example, Mongo, the whole idea of a table is a collection and the whole idea of a record,
is a document. It's completely different in Cassandra. Redis is just key value pairs.
Okay? And how about the results? I know different systems, like, for example, CouchDB, they
support different sets of outputs as well, too. Outputs that you can use JSON and even
binary JSON. So what does that have to do with anything security? We have this problem,
this infers this problem known as complexity. Okay? Now, in order to really understand the
problem with NoSQL, you've got to read each and every document in order to understand
the documentation individually, because different systems, different features, different inputs,
different outputs. Look, even MongoDB, some vendor‑specific items, MongoDB, MongoDB is
actually bound to all the interfaces when you take it out of the box. You can actually
take a look at, you know, some really cool start‑up data, such as process information,
and thus local collection. Okay? In CouchDB, HTTP is actually open by default.
All right. So how do you actually protect yourself from ‑‑ so what does this all
mean? I mean, how do you secure NoSQL databases? I hate to use this term known as defense in
depth because it's really overused. But the problem is it relies on the full perimeter.
Okay? Now, full perimeter security is really, really, really important. Okay? Configuration,
if you want to make NoSQL databases work right, configuration is a very, very important thing.
You just can't take it out of the box and expect it to use it right away. And this whole
idea of validation becomes very important. Not only are you validating inputs now, you
also have to ‑‑ you also have more things to validate in terms of inputs, including
JavaScript functions. Hey, for output, you also have to validate the binary JSON and
JSON as well, too. So validation becomes even more critical this time. Okay? So what does
this all mean? Look, back in the good old days, the only good ‑‑ the only game
in town were, what, Oracle, MySQL. You can build any application using that thing now.
But now, okay, those are not the only games in town, and you have systems such as Mongo,
Redis, Couch. You've got to use the right database for the right job, for the right
application. Okay? Yeah, so not only do you ‑‑ okay, so you can't just assume that
SQL injection has gone away. In fact, there's been many more opportunities ‑‑ many,
many more opportunities, depending on what database you're using. So you can't just assume
there's a database system that you choose. But the thing that really, really bugged the
living hell out of me are these things. Right now, no SQL databases are completely brand
new, but we have a problem right now with, A, we have technologies completely deployed
naively. They're just out there. I mean, people just say ‑‑ especially if you believe
in the hand of the developers, they just assume, okay, we're not going to get hit, we're just
going to put it out there and use it. No, that's not the way how it works. So now you
have the technologies being deployed naively, and you're going to have to go back to the
database. And one last thing. A lot of people use no SQL databases, I think is the word
on the street, so we can get away from this whole idea of a database administration. Well,
the DBA, the death of a DBA had been greatly, greatly exaggerated, because now they have
even more ‑‑ there's even more opportunities out there. You just have to read the documentation
and, you know, for what this database system would support. Okay? So those are my points.
And that's all that I have.
Let's see if there's anything else.
This thing actually ‑‑ just run. Nope, still running. Still running. Still running.
I don't know what happened to it. But what it will do, this thing will just exhaust 100
percent of the disk space on the server that I have. So that's all I got. Okay? Thank
you guys so much.
There's a lot.
